A method of DNA sequencing

The present invention relates to methods of nucleic
acid sequencing and in particular to
sequencing-by-synthesis methods, i.e. those methods based on the
detection of nucleotide incorporation during polymerase
extension, rather than on analysis of the nucleotide
sequence itself, and to the improvements derivable in
such methods by the use of a single-stranded DNA binding
protein.

DNA sequencing is an essential tool in molecular 
genetic analysis. The ability to determine DNA 
nucleotide sequences has become increasingly important 
as efforts have commenced to determine the sequences of 
the large genomes of humans and other higher organisms. 
The two most commonly used methods for DNA sequencing 
are the enzymatic chain-termination method of Sanger and
the chemical cleavage technique of Maxam and Gilbert. 
Both methods rely on gel electrophoresis to resolve, 
according to their size, DNA fragments produced from a 
larger DNA segment. Since the electrophoresis step as 
well as the subsequent detection of the separated DNA
fragments are cumbersome procedures, a great effort has
been made to automate these steps. However, despite the 
fact that automated electrophoresis units are 
commercially available, electrophoresis is not well 
suited for large-scale genome projects or clinical 
sequencing where relatively cost-effective units with 
high throughput are needed. Thus, the need for
non-electrophoretic methods for sequencing is great and
several alternative strategies have been described, such
as scanning tunnel electron microscopy (Driscoll et al.,
1990, Nature, 346, 294-296), sequencing by hybridization
(Bains et al., 1988, J. Theor. Biol. 135, 303-307) and
single molecule detection (Jett et al., 1989, Biomol.
Struct. Dynamics, 7, 301-306), to overcome the
disadvantages of electrophoresis.

Techniques enabling the rapid detection of a single 
DNA base change are also important tools for genetic 
analysis. In many cases detection of a single base or a 
few bases would be a great help in genetic analysis 
since several genetic diseases and certain cancers are 
related to minor mutations. 

Sequencing-by-synthesis methods are useful ways of
determining the sequence of a DNA molecule of up to a
hundred or more bases or the identity of a single
nucleotide within a sample DNA molecule. During typical
sequencing-by-synthesis methods the four different
nucleotides (adenine, thymine, guanine and cytosine) are
conveniently added cyclically in a specific order; when
the base which forms a pair (according to the normal
rules of base pairing, A-T and C-G) with the next base
in the single-strand target sequence is added, it will
be incorporated into the growing complementary strand by
a polymerase and this incorporation will trigger a
detectable signal. The event of incorporation can be
detected directly or indirectly. In direct detection,
nucleotides are usually fluorescently labelled allowing
analysis by a fluorometer (US Patent 4863849, US
Patent 5302509, Metzker et al. Nucl. Acids Res. (1994)
22: 4259-4267, Rosenthal International Patent
Application No. WO 93/21340, WO 91/06678, Canard et al.
Gene (1994) 148: 1-6). One such strategy of
sequencing-by-synthesis called base addition sequencing scheme
(BASS) is based on nucleotide analogues that terminate 
DNA synthesis. BASS involves repetitive cycles of 
incorporation of each successive nucleotide, in situ 
monitoring to identify the incorporated base, and 
deprotection to allow the next cycle of DNA synthesis. 
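
The cyclic addition of nucleotides described above may be illustrated by a short simulation. The following is a minimal sketch in Python under idealised assumptions and forms no part of the methods described: the function names, the dispensation order "ACGT" and the `max_cycles` limit are hypothetical, every incorporation is assumed to run to completion, and the signal is taken simply as the number of bases incorporated on each nucleotide addition.

```python
# Idealised model of cyclic nucleotide addition (illustrative assumption only).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_cyclic_addition(template, order="ACGT", max_cycles=100):
    """Return a list of (nucleotide, bases_incorporated) per addition.

    `template` lists the template bases in the order in which the
    polymerase meets them, starting at the target position.
    """
    events = []
    pos = 0
    for _ in range(max_cycles):
        for nucleotide in order:
            count = 0
            # Extend for as long as the added nucleotide pairs with the
            # next template base; a run of identical bases is consumed
            # in a single addition.
            while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
                count += 1
                pos += 1
            events.append((nucleotide, count))
            if pos == len(template):
                return events
    return events


def read_sequence(events):
    """Rebuild the synthesised strand from the incorporation events."""
    return "".join(nucleotide * count for nucleotide, count in events)
```

In this model a template read as TTGCA gives a double-height event for A followed by single events for C, G and T, so the synthesised strand is read as AACGT. An addition that yields a count of zero corresponds to the absence of a signal.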

Indirect detection usually takes advantage of 
enzymatic detection, e.g. measuring the release of PPi 
(inorganic pyrophosphate) during a polymerization
reaction (WO 93/23564 and WO 89/09283). As each
nucleotide is added to a growing nucleic acid strand 
during a polymerase reaction, a pyrophosphate molecule 
is released. It has been found that pyrophosphate 
released under these conditions can be detected 
enzymatically e.g. by the generation of light in the 
luciferase-luciferin reaction. Such methods enable a 
base to be identified in a target position and DNA to be 
sequenced simply and rapidly whilst avoiding the need 
for electrophoresis and the use of harmful radiolabels. 
These methods based on release of PPi are referred to 
herein as Pyrosequencing. The basic PPi-based
sequencing methods have been improved by using a dATP 
analogue in place of dATP (WO 98/13523) and including a 
nucleotide-degrading enzyme such as apyrase during the 
polymerase reaction step, so that unincorporated 
nucleotides are degraded, as described in WO 98/28440. 

However, the sequencing-by-synthesis methods
mentioned above are not without drawbacks. A particular
problem arises when the DNA to be sequenced has a number
of identical adjacent bases, especially three or more of
the same. Figure 1 shows the trace obtained when a
single-stranded PCR product is sequenced according to known
sequencing-by-synthesis methods (in this case involving
detection of PPi). Figure 1 shows that known methods do
not provide clear results when two or more adjacent
bases in the sample molecule are the same. For example,
the peak height when the first set of three adenine 
residues are incorporated is almost the same as when 
four thymine residues are incorporated later; the set of 
three adenine residues incorporated around the middle of 
the sequence have the same peak height as previous 
doublets and the last pair of adenine residues to be 
incorporated have a peak height corresponding to single 
bases from the earlier part of the sequence. 

Other problems of sequencing-by-synthesis methods
include false signals which are the result of
mispriming, i.e. hybridisation of the primer not to its
targeted complement within the target DNA sequence but
to another region, which will result in generation of
"incorporation signals" which do not reflect the
identity of the target sequence. There is an associated
problem which can result in a false indication of
incorporation, termed "minus frame incorporation", where
a proportion of the growing primer-originating strands
are not fully extended and false positive signals appear
in subsequent cycles.

Thus, there is a need further to improve 
sequencing-by-synthesis methods by addressing the above
problems and more generally to improve the accuracy of 
the methods while providing methods which are simple and 
quick to perform, lending themselves readily to 
automation. 

It has surprisingly been found that including a 
single-stranded nucleic acid binding protein in the 
reaction mixture improves the ratio of signals generated 
by one, two, three or more adjacent bases and reduces 
the number of false signals and generally improves the 
efficacy and reduces the cost of sequencing-by-synthesis
methods.

In one aspect, the present invention thus provides 
a method of identifying a base at a target position in a 
sample nucleic acid sequence wherein a primer, which 
hybridises to the sample nucleic acid immediately 
adjacent to the target position, is provided and the 
sample nucleic acid and primer are subjected to a 
polymerase reaction in the presence of a nucleotide 
whereby the nucleotide will only become incorporated if 
it is complementary to the base in the target position, 
and said incorporation is detected, characterised in
that a single-stranded nucleic acid binding protein is
included in the polymerase reaction step.

The nucleic acid to be sequenced may be any 
nucleotide sequence it is desirable to obtain sequence 



WO 00/43540 PCT/GBOO/00146 

information about. Thus, it may be any polynucleotide, 
or indeed oligonucleotide sequence. The nucleic acid 
may be DNA or RNA, and may be natural, isolated or 
synthetic. Thus, the target DNA may be genomic DNA, or 
cDNA, or a PCR product or other amplicon etc.
Alternatively, the target DNA may be synthetic, and
genomic DNA, cDNA or a PCR product etc. may be used as
primer. The target (sample) nucleic acid may be used in 
any convenient form, according to techniques known in 
the art e.g. isolated, cloned, amplified etc., and may 
be prepared for the sequencing reaction, as desired, 
according to techniques known in the art. 

The DNA may also be single- or double-stranded.
Whilst a single-stranded DNA template has traditionally
been used in sequencing reactions, or indeed in any
primer-extension reaction, it is possible to use a
double-stranded template; strand displacement, or a
localised opening-up of the two DNA strands may take 
place to allow primer hybridisation and polymerase 
action to occur. 

The sample nucleic acid acts as a template for 
possible polymerase based extension of the primer and 
thus may conveniently be referred to as "template" or 
"nucleic acid template". 

In the polymerase reaction, any convenient 
polymerase enzyme may be used according to choice, as 
will be described in more detail below. In the case of 
an RNA template, such a polymerase enzyme may be a
reverse transcriptase enzyme. The nucleotide may be any
nucleotide suitable for a polymerase chain extension
reaction e.g. a deoxynucleotide or a dideoxynucleotide.
The nucleotide may optionally be labelled in order to 
aid or facilitate detection of nucleotide incorporation. 
One or more nucleotides may be used. 

Nucleotide incorporation by the action of the 
polymerase enzyme may be detected directly or 
indirectly, and methods for this are well known in the
art. Representative methods are described for example
in US-A-4,863,849 of Melamede. As mentioned above,
detection of incorporation may be by means of labelled
nucleotides, for example fluorescently labelled
nucleotides, as is well known in sequencing procedures 
known in the art. Alternatively, the event of 
incorporation may be detected by other means e.g. 
indirectly. Detection of incorporation also includes 
the detection of absence of incorporation e.g. lack of a 
signal. Thus, it may be detected whether or not 
nucleotide incorporation takes place. 

The method of the invention thus has utility in a 
number of different sequencing methods and formats, 
including mini-sequencing procedures e.g. detection of
single base changes (for example, in detecting point
mutations, or polymorphisms, or allelic variations etc.).
The method of the invention may thus be used in a "full"
sequencing procedure, i.e. the identification of the
sequential order of the bases in a stretch of
nucleotides, as well as in single base detection
procedures.

For example, to determine sequence information in a 
target nucleotide sequence, different deoxynucleotides 
or dideoxynucleotides may be added either to separate 
aliquots of sample-primer mixture or successively to the 
same sample-primer mixture and subjected to the 
polymerase reaction to indicate which deoxynucleotide or 
dideoxynucleotide is incorporated. 

In order to sequence the target DNA, the procedure 
may be repeated one or more times i.e. cyclically, as is 
known in the art. In this way the identity of many 
bases in the sample nucleic acid may be identified, 
essentially in the same reaction. 

Hence, a sequencing protocol may involve annealing 
a primer as described above, performing a polymerase- 
catalysed primer extension step, detecting the presence 
or absence of incorporation, and repeating the
nucleotide addition and primer extension steps etc. one
or more times. As discussed above, nucleotides may be
added individually, or in a mixture,
successively to the same primer-template mixture, or to
separate aliquots of primer-template mixture etc.
according to choice, and the sequence information it is
desired to obtain.

The term "single-stranded nucleic acid binding 
protein" as used herein is intended to refer to the 
class of proteins collectively referred to by the term 
SSB (Chase et al., Ann. Rev. Biochem. (1986) 55, 103-136).
SSB has the general property of preferential binding to
single-stranded (ss) over double-stranded (ds) nucleic
acid molecules. The class includes E. coli
single-stranded binding protein (Eco SSB), T4 gene 32 protein
(T4 gp32), T7 SSB, coliphage N4 SSB, T4 gene 44/62
protein, adenovirus DNA binding protein (AdDBP or
AdSSB), calf thymus unwinding protein (UP1) and the like
(Coleman et al. CRC Critical Reviews in Biochemistry,
(1980) 7(3), 247-289) and p5 SSB from φ29 DNA (Lindberg
et al. J. Biol. Chem. (1989) 264 12700-08) (Nakashima et
al. FEBS Lett. (1974) 13 125). Any functionally
equivalent or analogous protein, for example derivatives 
or modifications of the above-mentioned proteins, may 
also be used. Eco SSB or derivatives thereof are 
particularly preferred for use in the methods of the 
present invention. 

Thus, modified single-stranded nucleic acid binding
proteins derived by isolation of mutants or by
manipulation of cloned single-stranded nucleic acid
binding proteins which maintain the advantageous 
properties described herein, are also contemplated for 
use in the methods of the invention. 

The term "dideoxynucleotide" as used herein 
includes all 2'-deoxynucleotides in which the
3'-hydroxyl group is absent or modified and thus, while
able to be added to the primer in the presence of the 
polymerase, is unable to enter into a subsequent 
polymerisation reaction. 

As described above, the method of the invention may 
be performed in a number of ways, and has utility in a 
variety of sequencing protocols. Viewed more generally, 
the present invention can thus be seen to provide the 
use of a single-stranded nucleic acid binding protein in
a nucleic acid sequencing-by-synthesis method. In
particular, the single-stranded nucleic acid binding 
protein is used to bind to the nucleic acid template. 

What is meant by a DNA sequencing-by-synthesis
method is defined above, namely that sequence
information is derived by detecting incorporation of a
nucleotide in a primer extension reaction. As explained
above, such sequencing-by-synthesis protocols include
not only "full" sequencing methods, but also
mini-sequencing methods etc., yielding more limited sequence
information. 

Any sequencing-by-synthesis method, as described
above, is suitable for use in the methods of the present 
invention but methods which rely on monitoring the 
release of inorganic pyrophosphate (PPi) are 
particularly preferred. In this case, incorporation of 
the nucleotide will be measured indirectly by enzymatic 
detection of released PPi. 

PPi can be determined by many different methods and 
a number of enzymatic methods have been described in the 
literature (Reeves et al., (1969), Anal. Biochem., 28,
282-287; Guillory et al., (1971), Anal. Biochem., 39,
170-180; Johnson et al., (1968), Anal. Biochem., 15,
273; Cook et al., (1978), Anal. Biochem. 91, 557-565;
and Drake et al., (1979), Anal. Biochem. 94, 117-120).

It is preferred to use luciferase and luciferin in 
combination to identify the release of pyrophosphate 
since the amount of light generated is substantially 
proportional to the amount of pyrophosphate released
which, in turn, is directly proportional to the amount
of base incorporated. The amount of light can readily 
be estimated by a suitable light sensitive device such 
as a luminometer. 
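
The proportionality stated above between light, pyrophosphate released and bases incorporated suggests how the number of identical adjacent bases might be estimated from a peak height. The following is a minimal sketch in Python, assuming an ideal linear response; the function name and the calibration value `single_base_signal` (the peak height observed for incorporation of one base) are hypothetical and are not taken from the present description.

```python
def estimate_bases_incorporated(peak_height, single_base_signal):
    """Estimate how many bases were incorporated in one nucleotide addition.

    Assumes the light signal is strictly proportional to the amount of
    PPi released, so a run of n identical bases gives n times the
    single-base signal. Both arguments use the same arbitrary units.
    """
    if single_base_signal <= 0:
        raise ValueError("single_base_signal must be positive")
    # Round to the nearest whole number of bases; baseline noise maps to 0.
    return max(0, round(peak_height / single_base_signal))
```

Under these assumptions a peak of 30.5 units with a single-base signal of 10 units would be read as three bases. Where the response is not linear, as in the trace of Figure 1, such an estimate becomes unreliable for runs of identical bases.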

Luciferin-luciferase reactions to detect the
release of PPi are well known in the art. In
particular, a method for continuous monitoring of PPi
release based on the enzymes ATP sulphurylase and
luciferase has been developed by Nyren and Lundin (Anal.
Biochem., 151, 504-509, 1985) and termed ELIDA
(Enzymatic Luminometric Inorganic Pyrophosphate
Detection Assay). The use of the ELIDA method to detect
PPi is preferred according to the present invention.
The method may however be modified, for example by the
use of a more thermostable luciferase (Kajiyama et al.,
1994, Biosci. Biotech. Biochem., 58, 1170-1171). This
method is based on the following reactions:



$$\mathrm{PPi + APS} \xrightarrow{\text{ATP sulphurylase}} \mathrm{ATP + SO_4^{2-}}$$

$$\mathrm{ATP + luciferin + O_2} \xrightarrow{\text{luciferase}} \mathrm{AMP + PPi + oxyluciferin + CO_2} + h\nu$$

(APS = adenosine 5'-phosphosulphate)



The preferred detection enzymes involved in the PPi 
detection reaction are thus ATP sulphurylase and 
luciferase. Methods of detecting the light emitted are 
well known in the art. 

In order to repeat the method cyclically and
thereby sequence the sample nucleic acid, and also to
aid separation of a single-stranded sample DNA from its
complementary strand, it may be desirable that sample
nucleic acid (DNA) is immobilised or provided with means
for attachment to a solid support.

Moreover, the amount of sample nucleic acid
available may be small and it may therefore be desirable 
to amplify the sample nucleic acid before carrying out 
the method according to the invention. 

The sample DNA may be amplified, for example in 
vitro by PCR, Self Sustained Sequence Replication (3SR),
Rolling Circle Amplification or Replication (RCA or
RCR), or indeed any other in vitro amplification
technique, or in vivo using a vector and, if desired, in 
vitro and in vivo amplification may be used in 
combination. Whichever method of amplification is used, 
it may be convenient to adapt the method such that the 
amplified nucleic acid becomes immobilised or is 
provided with means for attachment to a solid support. 
For example, a PCR primer may be immobilised or be
provided with means for attachment to a solid support. 
Also, a vector may comprise means for attachment to a 
solid support adjacent the site of insertion of the 
sample DNA such that the amplified sample DNA and the 
means for attachment may be excised together. 

Immobilisation of the amplified DNA may take place 
as part of the amplification itself, e.g. in PCR where
one or more primers are attached to a support, or 
alternatively one or more of the primers may carry means 
for immobilisation e.g. a functional group permitting 
subsequent immobilisation, e.g. a biotin or thiol group. 
Immobilisation by the 5' end of a primer allows the 
strand of DNA emanating from that primer to be attached 
to a solid support and have its 3' end remote from the
support and available for subsequent hybridisation with 
the extension primer and chain extension by polymerase. 

The solid support may conveniently take the form of 
microtitre wells, or dipsticks which may be made of 
polystyrene activated to bind the primer DNA (K Aimer, 
Doctoral Theses, Royal Institute of Technology, 
Stockholm, Sweden, 1988). However, any solid support
may conveniently be used, including any of the vast
number described in the art, e.g. for
separation/immobilisation reactions or solid phase assays. Thus,
the support may also comprise particles, fibres or
capillaries made, for example, of any polymer, e.g.
agarose, cellulose, alginate, Teflon or polystyrene.
Glass solid supports can also be used, e.g. glass plates 
or capillaries. Magnetic particles e.g. the 
superparamagnetic beads produced by Dynal AS (Oslo, 
Norway) are a preferred support since they can be 
readily isolated from a reaction mixture yet have 
superior reaction kinetics over many other forms of 
support.

The solid support may carry functional groups such 
as hydroxyl, carboxyl, aldehyde or amino groups, or
other moieties such as avidin or streptavidin, for the 
attachment of primers or the target nucleic acid. These 
may in general be provided by treating the support to 
provide a surface coating of a polymer carrying one of 
such functional groups, e.g. polyurethane together with 
a polyglycol to provide hydroxyl groups, or a cellulose 
derivative to provide hydroxyl groups, a polymer or 
copolymer of acrylic acid or methacrylic acid to provide 
carboxyl groups or an aminoalkylated polymer to provide 
amino groups. Sulphur and epoxy-based functional groups 
may also be used. US Patent No. 4654267 describes the
introduction of many such surface coatings. 

The assay technique is very simple and rapid, thus 
making it easy to automate by using a robot apparatus 
where a large number of samples may be rapidly analysed. 
Since the preferred detection and quantification is 
based on a luminometric reaction this can be easily 
followed spectrophotometrically. The use of
luminometers is well known in the art and described in
the literature.

As mentioned above, the sample nucleic acid, i.e.
the target nucleic acid to be sequenced, may be any
nucleotide sequence, however obtained according to
techniques known in the art, e.g. cloning, DNA isolation
etc. It may for example be cDNA synthesised from RNA in
the sample and the method of the invention is thus
applicable to diagnosis on the basis of characteristic
RNA. Such preliminary synthesis can be carried out by a
preliminary treatment with a reverse transcriptase,
conveniently in the same system of buffers and bases of
subsequent amplification e.g. PCR steps, if used. Since
the PCR procedure requires heating to effect strand
separation, in the case of PCR the reverse transcriptase
will be inactivated in the first PCR cycle. When mRNA
is the sample nucleic acid, it may be advantageous to 
submit the initial sample, e.g. a serum sample, to 
treatment with an immobilised polydT oligonucleotide in 
order to retrieve all mRNA via the terminal polyA 
sequences thereof. Alternatively, a specific 
oligonucleotide sequence may be used to retrieve the RNA 
via a specific RNA sequence. The oligonucleotide can 
then serve as a primer for cDNA synthesis, as described 
in WO 89/0982. 

Advantageously, the primer for the polymerase chain 
extension step (i.e. the extension primer) is
sufficiently large to provide appropriate hybridisation
with the sequence immediately 5' of the target position,
yet still reasonably short in order to avoid unnecessary
chemical synthesis. It will be clear to persons skilled
in the art that the size of the extension primer and the
stability of hybridisation will be dependent to some
degree on the ratio of A-T to C-G base pairings, since
more hydrogen bonding is available in a C-G pairing.
Also, the skilled person will consider the degree of
homology between the extension primer and other parts of
the amplified sequence and choose the degree of
stringency accordingly. Guidance for such routine
experimentation can be found in the literature, for
example, Molecular Cloning: a laboratory manual by
Sambrook, J., Fritsch E.F. and Maniatis, T. (1989).
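
The dependence of hybridisation stability on the ratio of A-T to C-G pairings can be illustrated with the Wallace rule, a widely used rule of thumb for short oligonucleotides (roughly 14 to 20 bases) which assigns about 2 °C to each A or T and about 4 °C to each G or C. The rule is not taken from the present description, and the following Python sketch, with its hypothetical function name, is offered only as a rough first estimate and not as a substitute for the routine experimentation referred to above.

```python
def wallace_tm(primer):
    """Approximate melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (number of A and T) + 4 * (number of G and C).
    Intended only for short primers under standard salt conditions.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc
```

On this estimate a 16-base primer composed only of A and T would melt at about 32 °C, whereas a 16-base primer composed only of G and C would melt at about 64 °C, which is why a primer rich in C-G pairings can be shorter for the same stability.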

The primer is conveniently added before the sample
is divided into (four) aliquots although it may be added
separately to each aliquot. It should be noted that the
extension primer may be identical with the PCR primer
but advantageously it may be different, to introduce a 
further element of specificity into the system. 

Where appropriate, the polymerase reaction is 
carried out using a polymerase which will incorporate 
deoxynucleotides and dideoxynucleotides, e.g. T7
polymerase, Klenow, φ29 DNA polymerase or Sequenase Ver.
2.0 (USB U.S.A.). Any suitable polymerase may be used
and many are known in the art and reported in the
literature. However, it is known that many polymerases
have a proof-reading or error checking ability and that
3' ends available for chain extension are sometimes
digested by one or more nucleotides. If such digestion
occurs in the method according to the invention the
level of background noise increases. In order to avoid
this potential problem, a nonproof-reading polymerase,
e.g. T7 polymerase or Sequenase may be used. Otherwise
it is desirable to add to each aliquot fluoride ions or
nucleotide monophosphates which suppress 3' digestion by
polymerase.

A fuller description of preferred embodiments of 
PPi-based sequencing-by-synthesis methods is provided
in WO 98/13523 and WO 98/28440 which are incorporated
herein by reference. The use of a dATP analogue such as
dATPαS in place of dATP is advantageous as it does not
interfere with the detection reaction: it is capable
of acting as a substrate for a polymerase but incapable
of acting as a substrate for the PPi-detection enzyme
luciferase. It is therefore possible to perform the
chain extension and detection, or signal-generation,
reactions substantially simultaneously by including the
"detection enzymes" in the chain extension reaction
mixture and therefore the sequencing reactions can be
continuously monitored in real-time, with a signal being
generated and detected, as each nucleotide is
incorporated.

Inclusion of a nucleotide degrading enzyme in the 
reaction mix is also advantageous as it means that it is 
not necessary to wash the template thoroughly between 
each nucleotide addition to remove all non-incorporated
deoxynucleotides, which has the associated benefit that 
a template can be sequenced which is not bound to a 
solid support. 

In a particularly preferred method of the
invention, the nucleotide-degrading enzyme apyrase is 
included during the polymerase reaction step and it has 
been found that the single-stranded binding proteins 
used in the methods of the present invention are able to 
stimulate the activity of apyrase. Whilst not wishing 
to be bound by theory, it is believed that the "SSB" may 
play a role in reducing the inhibition of apyrase which 
may be observed in the presence of DNA. 

Thus in a further aspect, the present invention 
provides a method of enhancing the activity of a 
nucleotide-degrading enzyme when used in a nucleic acid 
sequencing-by-synthesis method, which comprises the use 
of a single-stranded nucleic acid binding protein. More
particularly, the methods comprise including or adding a
single-stranded nucleic acid binding protein to the
sequencing reaction mixture (i.e. the template, primer,
polymerase and/or nucleotide (e.g. dNTP/ddNTP) mix).

A similar enhancing effect has been observed on the 
luciferase enzyme which may be used in signal detection
and therefore in a further aspect the present invention
provides a method of enhancing the activity of
luciferase when used as a detection enzyme in a nucleic
acid sequencing-by-synthesis method which comprises the
use of a single-stranded nucleic acid binding protein.
Again, such methods involve including or adding a 
single-stranded nucleic acid binding protein to the 
sequencing reaction mixture.

In a preferred embodiment of the present invention
the single-stranded nucleic acid binding protein is 
added after hybridisation of the primer to the template 
nucleic acid molecule. 

It is also preferred not to remove the
single-stranded nucleic acid binding protein after it has been
added.

It is a particular advantage of the present 
sequencing methods that there need be no separation of 
the different reagents and enzymes involved in the 
extension and detection reactions but the labelled or 
unlabelled nucleotides or nucleotide analogues, sample, 
polymerase and where appropriate enzymes and enzyme 
substrates, as well as the single-stranded nucleic acid 
binding protein can be included in the reaction mixture 
and there is no need to remove the single-stranded
nucleic acid binding protein for detection to take
place.

The reaction mixture for the polymerase chain 
extension step may optionally include additional 
ingredients or components if desired. Thus, such 
additional components can include other substances or 
molecules which bind to DNA. Thus, amines such as 
spermidine may be used. It was observed that improved 
results may be obtained using spermidine in run-off 
extension reactions. 

Alternative additional components include DNA 
binding proteins such as RecA. In particular, it has 
been observed that a synergy occurs between RecA and a 
single-stranded nucleic acid binding protein, leading to 
improved results. Accordingly, the combination of RecA 
with a single-stranded nucleic acid binding protein 
represents a preferred embodiment according to the 
present invention. Other DNA binding proteins involved 
in DNA replication, recombination, or structural 
organisation may also be used in similar manner. 

Other components which may be included,
particularly when a double stranded substrate is used,
include DMSO and formamide, and other agents which may 
destabilise or assist in destabilising double-strand 
formation, for example accessory proteins involved in 
DNA replication such as helicase. 

When a single strand nucleic acid binding protein 
is used in accordance with the methods of the present 
invention, the methods are robust and results are 
readily reproducible. The read-length, i.e. the length
of nucleic acid which can be successfully sequenced, has
been increased by four times as compared to previous
sequencing-by-synthesis methods. The methods of the
invention are suitable for midi-sequencing, i.e. the
sequencing of nucleic acid molecules having 9-50 bases,
mini-sequencing, the detection of single bases such as
SNPs (single nucleotide polymorphisms) responsible for 
genetic diseases and the sequencing of nucleic acid 
molecules of 100 bases or more. It is in the successful 
sequencing of larger molecules that the benefits of a 
single strand nucleic acid binding protein in, 
particularly PPi based, sequencing methods are observed. 
The problems of maintaining a constant signal intensity 
for incorporation of one nucleotide, two nucleotides and 
so on, throughout the whole sequencing run are overcome. 

In particular, the present invention may 
advantageously be used in the sequencing of 25 or more, 
advantageously 30 or more, 50 or more, or 60 or more
bases.

As well as increasing the read length, the use of a 
single stranded nucleic acid binding protein enables the 
use of longer template molecules. Thus, sequence 
information from e.g. a 50 base region within a template
molecule of 400 or more bases can be obtained. Template 
molecules of 800 or more, even 1200 or 1500 or more 
bases can be used in the methods of the present 
invention. 

The amount of sequencing template accessible for
the sequencing reaction may be reduced due to specific
and/or unspecific interactions between the template and 
components of the reaction mixture and/or the surface of 
the vessel holding the reaction mixture. Such 
interaction may result in the reduction of the detected 
signal and/or generation of unspecific sequencing 
signal. SSB is believed to protect the template from 
such undesirable interactions and thus improve signal 
intensity and specificity. Furthermore, the protein may 
assist in "opening-up" and/or maintaining or stabilising
the "open" structure of a double-stranded template. 

When using the sequencing methods of the invention 
to detect mutations in a nucleic acid molecule, the 
sequencing reaction may advantageously be run 
bidirectionally to confirm the mutation. 

Moreover, a further beneficial feature of the 
present invention is the stimulation of Klenow 
polymerase which is observable using the single-stranded 
nucleic acid binding protein according to the methods 
described herein. 

Thus in a further aspect, the present invention 
provides a method of maintaining a constant signal 
intensity during a method of nucleic acid sequencing-by- 
synthesis comprising the use of a single-stranded 
nucleic acid binding protein. More particularly, the 
methods comprise including or adding a single-stranded 
nucleic acid binding protein to the sequencing reaction 
mixture (i.e. the template, primer, polymerase and/or 
nucleotide (e.g. dNTP/ddNTP) mix). 

It is to be understood that 'constant signal 
intensity' in this context means that the strength of 
the signal, however measured, which indicates 
incorporation of the correct base-pair nucleotide 
remains substantially the same throughout the sequencing 
reaction, whether it is the first nucleotide 
incorporated, the twentieth or the sixtieth etc. 
Similarly, the strength of signal indicating 
incorporation of two nucleotides (i.e. two adjacent bases 
are the same in the molecule to be sequenced) remains 
constant throughout the whole sequencing reaction and so 
on for three bases, four bases etc. 
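
The definition above can be illustrated with a minimal sketch. It assumes that each peak is recorded as a (height, number of bases incorporated) pair and that "substantially the same" is taken to mean a relative spread of at most 10 %; the function names, the threshold and the peak values are all hypothetical and are not taken from the Examples.

```python
from statistics import mean, pstdev


def per_base_signal(peaks):
    """Normalise each peak height by the number of bases incorporated."""
    return [height / n_bases for height, n_bases in peaks if n_bases > 0]


def relative_spread(peaks):
    """Population standard deviation of the per-base signal over its mean."""
    values = per_base_signal(peaks)
    return pstdev(values) / mean(values)


def is_constant(peaks, tolerance=0.10):
    """True if the per-base signal varies by no more than `tolerance`.

    The 10 % default is an assumed threshold, not a value from the text.
    """
    return relative_spread(peaks) <= tolerance


# Hypothetical peak lists in arbitrary light units: (height, bases incorporated).
steady_run = [(10.0, 1), (20.5, 2), (9.8, 1), (30.3, 3)]
decaying_run = [(10.0, 1), (18.0, 2), (7.0, 1), (12.0, 2)]

print(is_constant(steady_run))
print(is_constant(decaying_run))
```

Dividing by the number of incorporated bases means that single, double and triple incorporations are all judged against the same per-base level, which is the sense in which the signal is said to remain constant.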

The single-stranded nucleic acid binding protein is 
present in the 'reaction mixture', i.e. together with 
the reagents, enzymes, buffers, primer, sample etc. 
which may include a solid support and is the site of 
polymerisation and also, where appropriate, detection. 

A further benefit of the use of a single-stranded 
nucleic acid binding protein in accordance with the 
present invention is the relatively small amount of 
sample nucleic acid which is required for generation of 
useful sequence information. Approximately 0.05 pmol 
DNA in a 50 µl reaction is sufficient to obtain sequence 
information, which means the quantity of enzymes needed 
for carrying out the extension reactions is less per 
cycle of nucleotide additions and per full sequencing 
reaction. 
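
As a rough guide to scale, the quantity quoted above can be converted to a concentration and a molecule count. The sketch below is plain unit arithmetic; the function names are illustrative only.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def concentration_nM(amount_pmol, volume_ul):
    """pmol per µl equals µmol per litre (µM); multiply by 1000 for nM."""
    return amount_pmol / volume_ul * 1000.0


def molecule_count(amount_pmol):
    """Number of template molecules in the given amount."""
    return amount_pmol * 1e-12 * AVOGADRO


# 0.05 pmol of DNA in a 50 µl reaction, as stated in the text.
print(concentration_nM(0.05, 50.0))  # about 1 nM
print(molecule_count(0.05))          # about 3e10 molecules
```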

In a further aspect, the present invention provides 
a kit for use in a method of sequencing-by-synthesis 
which comprises nucleotides for incorporation, a 
polymerase, means (e.g. any reagents and enzymes needed) 
for detection of incorporation and a single-stranded 
nucleic acid binding protein. 

The invention will now be described by way of non- 
limiting Examples with reference to the Figures in 
which: 

Figure 1 shows a sequencing method of the prior art 
performed on a 130-base-long single-stranded PCR product 
hybridized to the sequencing primer. About 2 pmol of 
the template/primer was used in the assay. The reaction 
was started by the addition of 0.6 nmol of the indicated 
deoxynucleotide and the PPi released was detected. The 
DNA-sequence after the primer is indicated in the 
Figure. 

Figure 2 shows Pyrosequencing of a PCR product a) 
in the absence of a single-stranded DNA binding protein, 
b) in the presence of SSB from T4 phage (T4gp32), and 
c) in the presence of SSB from E. coli. Lower quality 
of sequence data is obtained in the absence of SSB as 
indicated by arrows. 

Figure 3 shows a) the sequencing of mutated p53 
template in the absence of a single strand DNA binding 
protein and b) sequencing of the same sequence when SSB 
is included. 

Figure 4 shows a run off extension signal obtained 
on a 1500bp long PCR fragment using Klenow DNA 
polymerase A) in the presence of SSB and B) in the 
absence of a single-stranded DNA binding protein. 

Figure 5 shows the result of Pyrosequencing an 
800bp long PCR template a) in the presence of SSB and b) 
in the absence of a single-stranded DNA binding protein. 

Figure 6 shows, schematically, inhibition of 
apyrase in the presence of DNA and the ability of SSB to 
reduce the interaction of apyrase with the DNA. 

Figure 7 shows the Pyrosequencing of a 450bp cDNA 
template A) in the absence of a single strand DNA 
binding protein and B) in the presence of SSB. The 
numbers underneath the peaks show the number of bases 
incorporated. In A) the dashed line indicates how the 
strength of signal which results from the incorporation 
of one base decreases as the sequencing reaction 
progresses. In B) the dashed lines indicate how the 
strength of signal as one or two bases are incorporated 
remains constant when the reaction is carried out in the 
presence of SSB. 

EXAMPLE 1 

Comparison of sequencing using E. coli SSB, SSB from T4 
phage (T4gp32) and no SSB: 



Pyrosequencing was performed on approximately 0.2 pmol 
of a 405-base-long single-stranded PCR product obtained 
from mitochondrial DNA hybridised to 2 pmol of 
sequencing primer pH3A (5'-GCTGTACTTGCTTGTAAGC). Primed 
DNA template together with 2 µg of SSB from E. coli 
(Amersham Pharmacia Biotech, Uppsala, Sweden) or 2 µg of 
T4gp32 (Amersham Pharmacia Biotech) was added to a four- 
enzyme-mixture comprising 6 U DNA polymerase 
(exonuclease-deficient Klenow DNA polymerase), 20 mU ATP 
sulfurylase, 200 ng firefly luciferase, and 50 mU 
apyrase. 

The sequencing procedure was carried out by stepwise 
elongation of the primer-strand upon sequential addition 
of the different deoxynucleoside triphosphates 
(Pharmacia Biotech). The reaction was carried out at 
room temperature. Light was detected by a Pyrosequencer 
(Pyrosequencing AB, Uppsala, Sweden) as described by 
Ronaghi et al. (Science 1998, 281: 363-365). See Fig. 
2. Unincorporated nucleotides were degraded by apyrase, 
allowing sequential addition of the four different 
nucleotides in an iterative manner. For this Example 
the correct sequence is: 
5'-AGCCCCACCCCCGGGGCAGCGCCAGG. 
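
The relationship between this sequence and the peaks expected from iterative nucleotide addition can be sketched as follows. This is an idealised model only: it assumes complete incorporation at every step, complete degradation of unincorporated nucleotide between additions, and a cyclic A, C, G, T dispensation order, which is an assumption and not the order used in Fig. 2. The function name is hypothetical.

```python
from itertools import cycle


def expected_peaks(read, order="ACGT"):
    """Return (nucleotide, bases incorporated) for each dispensation.

    `read` is the sequence synthesised after the primer. Nucleotides are
    offered cyclically in `order` (an assumed order) until the read is
    complete; a dispensation that does not match the next base gives 0.
    """
    if set(read) - set(order):
        raise ValueError("read contains nucleotides that are never dispensed")
    peaks = []
    pos = 0
    for nucleotide in cycle(order):
        if pos >= len(read):
            break
        run = 0
        # A run of identical bases is extended in a single step.
        while pos < len(read) and read[pos] == nucleotide:
            run += 1
            pos += 1
        peaks.append((nucleotide, run))
    return peaks


READ = "AGCCCCACCCCCGGGGCAGCGCCAGG"  # sequence given for this Example
peaks = expected_peaks(READ)
nonzero = [(nt, n) for nt, n in peaks if n > 0]
print(nonzero)
```

The non-zero entries are the run lengths of the sequence, so the run of five C residues is expected to give a single peak about five times the height of a single-base peak. It is for such multi-base peaks that a constant per-base signal matters.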



EXAMPLE 2 

Detection of Mutations: 



For mutation detection by Pyrosequencing, 2 pmol of 
primer COMP53 (5'-GCTATCTGAGCAGCGCTCA) was hybridised to 
the immobilised single-stranded PCR products obtained 
from exon 5 of the p53 gene from normal and tumour tissues. 
1/10 of the template obtained from a PCR product was 
used in a Pyrosequencing reaction and the released light 
was detected as described by Ronaghi et al. (Science 1998, 
281: 363-365). There are two altered bases in the 
mutated template, which can be seen by a signal 
corresponding to 1.5 bases for T, 0.5 bases for A and 1 
base for G. See Fig. 3. The sequence for this Example 
is 5'-CAT(TA/GG)TGGGGG. 
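
The fractional signals quoted above can be reproduced with a short idealised calculation. The sketch assumes that the sample is an equal mixture of two templates, CATTATGGGGG and CATGGTGGGGG (one reading of the notation CAT(TA/GG)TGGGGG), and that nucleotides are dispensed in the order C, A, T, A, G, T, G. Both the mixture ratio and the dispensation order are assumptions chosen for illustration.

```python
def incorporated(read, dispensation):
    """Bases incorporated at each dispensation for a single template."""
    pos = 0
    counts = []
    for nucleotide in dispensation:
        run = 0
        while pos < len(read) and read[pos] == nucleotide:
            run += 1
            pos += 1
        counts.append(run)
    return counts


def mixed_signal(alleles, dispensation):
    """Average per-dispensation signal for an equal mixture of templates."""
    per_allele = [incorporated(allele, dispensation) for allele in alleles]
    return [sum(column) / len(alleles) for column in zip(*per_allele)]


ALLELE_TA = "CATTATGGGGG"  # assumed first variant
ALLELE_GG = "CATGGTGGGGG"  # assumed second variant
DISPENSATION = "CATAGTG"   # assumed dispensation order

signal = mixed_signal([ALLELE_TA, ALLELE_GG], DISPENSATION)
print(list(zip(DISPENSATION, signal)))
# [('C', 1.0), ('A', 1.0), ('T', 1.5), ('A', 0.5), ('G', 1.0), ('T', 1.0), ('G', 5.0)]
```

Under these assumptions the third to fifth dispensations give 1.5 for T, 0.5 for A and 1 for G, matching the values stated in the text.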

EXAMPLE 3 

Stimulation by SSB of Klenow polymerase activity. 

A run off extension signal was obtained on a 1500 bp 
long PCR fragment using Klenow DNA polymerase in the 
presence of SSB and in the absence of SSB. PCR was 
performed on the 16S rRNA gene with ENV1 (U1 universal primer 
E. coli positions 8-27) 5'-AGAGTTTGATIITGGGTGAG and 
ENV2B (U8 universal primer E. coli positions 1515-1493) 
5'-B-CGGITACCTTGTTACGACTT. After alkali treatment, the 
obtained single-stranded template (1/20 of a PCR 
product) was hybridised to 0.5 pmol of ENV1 primer for a 
run off extension reaction. 0.5 µg of E. coli SSB was 
added before extension using a coupled enzymatic 
reaction as described by Nyren et al. (Anal. Biochem. 
1997, 244, 367-373). See Fig. 4. 

EXAMPLE 4 

Pyrosequencing of an 800 bp long PCR template in the 
presence of SSB, and in the absence of SSB 

Pyrosequencing was performed on an 800-base-long single- 
stranded PCR product hybridised to the sequencing primer 
FSS-SEQ-DOWN (5'-GTGCTGGGGCCCAGATCTG). Two µg of SSB 
from E. coli was added to the template/primer (1/5 of a 
PCR product in which 5 pmol of primers have been used 
for in vitro amplification) and the obtained complex was 
used in a Pyrosequencing reaction as described by 
Ronaghi et al. (Science 1998, 281: 363-365). The 
sequence obtained by Pyrosequencing using SSB is as 
follows: ATAGGGGTGGGGAATTGGGGGTCGAGGCACGCGGCGGGGGATGGGA 
CTTGGCCGAGGTGTGGTTTTG. See Fig. 5. 

EXAMPLE 5 

Inhibition of apyrase by SSB in presence of DNA 



To a Pyrosequencing mixture containing 200 ng luciferase 
and 50 mU apyrase, 4 pmol of ATP is added, and a signal 
is obtained. When the degradation curve reaches the 
base-line, 20 pmol of Romo70A (a 70-base-long 
oligonucleotide) is added to the solution (nucleotide 
degradation by apyrase is inhibited 2.2 times, as shown by a 
longer time needed to level off to the base line). When 
2 µg of SSB is added to the solution, the interaction of 
the apyrase with the DNA is diminished, apyrase 
functions freely in the solution, and a signal similar to that 
before oligonucleotide addition is obtained. Further 
addition of 4 pmol ATP shows approximately the same 
degradation rate as before oligonucleotide addition to 
the solution. SSB is thus effectively able to stimulate 
apyrase activity. See Fig. 6. 

EXAMPLE 6 

Role of SSB in maintaining a constant signal during the 
sequencing reaction 

Pyrosequencing of a 450 base-long cDNA template obtained 
by PCR using universal primers was performed in the 
absence of SSB, and in the presence of SSB. The 450bp 
single-stranded PCR product was hybridized to the 
sequencing primer FSS-SEQ-DOWN (5'-GTGCTGGGGCCCAGATCTG). 
2 µg of SSB from E. coli was added to the 
template/primer (1/5 of a PCR product in which 5 pmol of 
primers have been used for in vitro amplification) and 
the obtained complex was used in a Pyrosequencing 
reaction as described by Ronaghi et al. (Science 1998, 
281: 363-365). The sequence obtained by Pyrosequencing 
using SSB is: ATACGGGTGGGGAATTGCCGGGTGGACCCACGCA. 
See Fig. 7. 



